Probabilistic Combination of Classifier and Cluster Ensembles for Non-transductive Learning
نویسندگان
چکیده
Unsupervised models can provide supplementary soft constraints to help classify new, target data since similar objects in the target set are more likely to share the same class label. Such models can also help detect possible differences between training and target distributions, which is useful in applications where concept drift may take place. This paper describes a Bayesian framework that takes as input class labels from existing classifiers, as well as cluster labels from a cluster ensemble operating solely on the target data to be classified, and yields a consensus labeling of the target data. Classifiers are first designed based on labeled data and subsequently, when unlabeled target data is available, the existing classifiers can be effectively applied to it with the aid of a cluster ensemble. This framework is particularly useful when the statistics of the target data drift or change from those of the training data. We also show that the proposed framework is privacy-aware and allows performing transductive learning even when data/models are distributed and have sharing restrictions. A variety of experiments with real-world data show that our framework can yield superior results to those provided by applying classifier ensembles only.
منابع مشابه
A Probabilistic Bayesian Classifier Approach for Breast Cancer Diagnosis and Prognosis
Basically, medical diagnosis problems are the most effective component of treatment policies. Recently, significant advances have been formed in medical diagnosis fields using data mining techniques. Data mining or Knowledge Discovery is searching large databases to discover patterns and evaluate the probability of next occurrences. In this paper, Bayesian Classifier is used as a Non-linear dat...
متن کاملA Probabilistic Bayesian Classifier Approach for Breast Cancer Diagnosis and Prognosis
Basically, medical diagnosis problems are the most effective component of treatment policies. Recently, significant advances have been formed in medical diagnosis fields using data mining techniques. Data mining or Knowledge Discovery is searching large databases to discover patterns and evaluate the probability of next occurrences. In this paper, Bayesian Classifier is used as a Non-linear dat...
متن کاملTransductive Learning from Relational Data
Transduction is an inference mechanism “from particular to particular”. Its application to classification tasks implies the use of both labeled (training) data and unlabeled (working) data to build a classifier whose main goal is that of classifying (only) unlabeled data as accurately as possible. Unlike the classical inductive setting, no general rule valid for all possible instances is genera...
متن کاملPersian Handwritten Digit Recognition Using Particle Swarm Probabilistic Neural Network
Handwritten digit recognition can be categorized as a classification problem. Probabilistic Neural Network (PNN) is one of the most effective and useful classifiers, which works based on Bayesian rule. In this paper, in order to recognize Persian (Farsi) handwritten digit recognition, a combination of intelligent clustering method and PNN has been utilized. Hoda database, which includes 80000 P...
متن کاملTRANS-SC: a Transductive Structural Classifier
The goal of transductive inference (or transduction) is to mine both labeled and unlabeled data within the same learning step and output a classification of the unlabeled examples with few errors as possible. We propose a new transductive classifier, named TRANS-SC, that works in a transductive setting and resorts to a principled probabilistic classification in multi-relational data mining to d...
متن کامل